Image Extracting from Ancient Arab Documents with Complex Structures

Authors

  • Mohamed Aymen Charrada
  • Najoua Essoukri Ben Amara
Abstract

In this paper, we give an overview of approaches to graphic extraction, and especially to image extraction, from printed documents. We also present our contribution to this area, which operates on historical Arab periodicals. The developed method works on monochrome documents and is based on Gabor filtering followed by several post-processing steps. It first separates text from graphics, then distinguishes images from the other graphic classes (drawings, textured titles). The tests and experiments were performed on an image database built from historical documents with complex structures, collected from various newspapers held by the National Archives of Tunisia. The results obtained, and a comparison with another existing approach, open many interesting perspectives and show that our approach can attain a high level of acceptability.
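The abstract does not state the filter parameters or the post-processing steps, so the following is only a minimal sketch of the general idea behind Gabor-based texture analysis, not the authors' method. It builds a real (cosine) Gabor kernel and compares its mean response on a striped, text-like patch against a uniform patch. The function names `gabor_kernel` and `gabor_energy`, the parameter values, and the synthetic patches are all assumptions chosen for illustration.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Build a zero-mean real Gabor kernel as a nested list (size should be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    # Remove the mean so that uniform regions produce a (near) zero response.
    n = len(kernel)
    mean = sum(sum(row) for row in kernel) / (n * n)
    return [[v - mean for v in row] for row in kernel]


def gabor_energy(image, kernel):
    """Mean absolute filter response over all fully overlapping positions."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            response = sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(k)
                for b in range(k)
            )
            total += abs(response)
            count += 1
    return total / count


# Hypothetical test patches: vertical stripes of period 4 versus a flat region.
stripes = [[1.0 if (j % 4) < 2 else 0.0 for j in range(16)] for _ in range(16)]
flat = [[1.0] * 16 for _ in range(16)]

# Assumed parameters: the kernel is tuned to the stripe period and orientation.
kernel = gabor_kernel(size=7, wavelength=4.0, theta=0.0, sigma=2.0)
print("stripes:", gabor_energy(stripes, kernel))
print("flat:   ", gabor_energy(flat, kernel))
```

In a sketch like this, regions with regular, text-like periodicity give a strong response at the matching frequency and orientation, while uniform regions give almost none. A practical system would typically use a bank of kernels at several orientations and wavelengths and then threshold or cluster the responses.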

Similar resources

Apport du traitement des images à la numérisation des documents manuscrits anciens

Image processing is often necessary for extracting the content of ancient documents. We present here techniques for restoring images and removing noise, extracting document structures (separating graphical elements and illustrations from text, extracting text lines) and, when possible, recognizing the textual or musical symbols which may be present in the image. These techniques, which are clas...


Contourlet-Based Edge Extraction for Image Registration

Image registration is a crucial step in most image processing tasks for which the final result is achieved from a combination of various resources. In general, the majority of registration methods consist of the following four steps: feature extraction, feature matching, transform modeling, and finally image resampling. As the accuracy of a registration process is highly dependent on the fe...


Blind Source Separation Techniques for Detecting Hidden Texts and Textures in Document Images

Blind Source Separation techniques, based both on Independent Component Analysis and on second order statistics, are presented and compared for extracting partially hidden texts and textures in document images. Barely perceivable features may occur, for instance, in ancient documents previously erased and then re-written (palimpsests), or for transparency or seeping of ink from the reverse side...


Digital Image Enhancement using Normalization Techniques and their application to Palm Leaf Manuscripts

Palm leaves were one of the earliest forms of writing media and their use as writing material in South and Southeast Asia has been recorded from as early as the fifth century B.C. until as recently as the late 19th century. Palm leaf manuscripts relating to art and architecture, mathematics, astronomy, astrology, and medicine dating back several hundreds of years are still available for referen...


A Model for Extracting Information from Text Documents, Based on Text Mining in the Field of E-Learning

As computer networks become the backbones of science and the economy, enormous quantities of documents become available. So, for extracting useful information from textual data, text mining techniques have been used. Text mining has become an important research area that discovers unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...




Publication date: 2014